{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ymsq1Lw0VEqT"
      },
      "source": [
        "# Agentic Data Model Generation and Structured Output Powered by CAMEL & Qwen"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7V3aV16AmY0K"
      },
      "source": [
        "You can also check this cookbook in colab [here](https://colab.research.google.com/drive/18E6W05FlykjqptWVMwCQGJoG2WyaY47A?usp=sharing)  (Use the colab share link)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YCjIggokqhvp"
      },
      "source": [
        "<div class=\"align-center\">\n",
        "  <a href=\"https://www.camel-ai.org/\"><img src=\"https://i.postimg.cc/KzQ5rfBC/button.png\"width=\"150\"></a>\n",
        "  <a href=\"https://discord.camel-ai.org\"><img src=\"https://i.postimg.cc/L4wPdG9N/join-2.png\"  width=\"150\"></a></a>\n",
        "  \n",
        "⭐ <i>Star us on [*Github*](https://github.com/camel-ai/camel), join our [*Discord*](https://discord.camel-ai.org) or follow our [*X*](https://x.com/camelaiorg)\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "O4XWsJes0Zfa"
      },
      "source": [
        "![py 3 (1).png]()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G5gE04UuPUWj"
      },
      "source": [
        "This notebook demonstrates how to set up and leverage CAMEL's ability of structure output, like JSON, and Pydantic objects.\n",
        "\n",
        "In this notebook, you'll explore:\n",
        "\n",
        "*   **CAMEL**: A powerful multi-agent framework that enables Retrieval-Augmented Generation and multi-agent role-playing scenarios, allowing for sophisticated AI-driven tasks.\n",
        "*   **Structure output**: The ability of LLMs to return structured output.\n",
        "*   **Qwen**: The Qwen model is a series of LLMs and multimodal models developed by the Qwen Team at Alibaba Group. Designed for diverse scenarios, Qwen integrates advanced AI capabilities, such as natural language understanding, text and vision processing, programming assistance, and dialogue simulation.\n",
        "\n",
        "This setup not only demonstrates a practical application but also serves as a flexible framework that can be adapted for various scenarios requiring structure output and data generation."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0J0_iW-YVcq2"
      },
      "source": [
        "## 📦 Installation"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7p-JjpyNVcCT"
      },
      "source": [
        "First, install the CAMEL package with all its dependencies:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0GXs2pruU9Vl"
      },
      "outputs": [],
      "source": [
        "!pip install \"camel-ai==0.2.16\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lfNvFbhD6o8B"
      },
      "source": [
        "## 🔑 Setting Up API Keys"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qfQwhEMDuYNl"
      },
      "source": [
        "Your can go to [here](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api/) to get API Key from Qwen AI."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tGtZnvmntwdj",
        "outputId": "f371a5dd-d8f9-4972-e666-b87cfc58453e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Enter your API key: ··········\n"
          ]
        }
      ],
      "source": [
        "# Prompt for the API key securely\n",
        "import os\n",
        "from getpass import getpass\n",
        "\n",
        "qwen_api_key = getpass('Enter your API key: ')\n",
        "os.environ[\"QWEN_API_KEY\"] = qwen_api_key"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Alternatively, if running on Colab, you could save your API keys and tokens as **Colab Secrets**, and use them across notebooks.\n",
        "\n",
        "To do so, **comment out** the above **manual** API key prompt code block(s), and **uncomment** the following codeblock.\n",
        "\n",
        "⚠️ Don't forget granting access to the API key you would be using to the current notebook."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# import os\n",
        "# from google.colab import userdata\n",
        "\n",
        "# os.environ[\"QWEN_API_KEY\"] = userdata.get(\"QWEN_API_KEY\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NEUciNquON9_"
      },
      "source": [
        "## Qwen data generation"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6f64VOMMP93d"
      },
      "source": [
        "In this section, we'll demonstrate how to Qwen to generate structured data. [Qwen](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api) is a good example in Camel of using prompt engineering for structure output. It offers powerful models like **Qwen-max**, **Qwen-coder**, but yet not support structure output by itself. We can then make use of its own ability to generate structured data."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "46Irp_SurLaV"
      },
      "source": [
        "Import necessary libraries, define the Qwen agent, and define the Pydantic classes."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QVB-Xra8QIU1"
      },
      "source": [
        "The following function retrieves relevant information from a list of URLs based on a given query. It combines web scraping with Firecrawl and CAMEL's AutoRetriever for a seamless information retrieval process. (Some explanation)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "gE_qBFCVveBR"
      },
      "outputs": [],
      "source": [
        "from pydantic import BaseModel, Field\n",
        "\n",
        "from camel.agents import ChatAgent\n",
        "from camel.messages import BaseMessage\n",
        "from camel.models import ModelFactory\n",
        "from camel.types import ModelPlatformType, ModelType\n",
        "from camel.configs import QwenConfig"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "jnVCqRIS9snF"
      },
      "outputs": [],
      "source": [
        "# Define Qwen model\n",
        "qwen_model = ModelFactory.create(\n",
        "    model_platform=ModelPlatformType.QWEN,\n",
        "    model_type=ModelType.QWEN_CODER_TURBO,\n",
        "    model_config_dict=QwenConfig().as_dict(),\n",
        ")\n",
        "\n",
        "qwen_agent = ChatAgent(\n",
        "    model=qwen_model,\n",
        "    message_window_size=10,\n",
        ")\n",
        "\n",
        "\n",
        "# Define Pydantic models\n",
        "class Student(BaseModel):\n",
        "    name: str\n",
        "    age: str\n",
        "    email: str"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1g31Q_CUCSZ3"
      },
      "source": [
        "First, let's try if we don't specific format just in prompt.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "KX_Ojed_CRtx",
        "outputId": "1534832d-0c6d-408f-ae13-b7297452708d"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Sure! Here is an example of a student's information in JSON format:\n",
            "\n",
            "```json\n",
            "{\n",
            "    \"name\": \"John Doe\",\n",
            "    \"age\": \"20\",\n",
            "    \"email\": \"johndoe@example.com\"\n",
            "}\n",
            "```\n",
            "\n",
            "Feel free to replace the values with actual data for your specific use case!\n"
          ]
        }
      ],
      "source": [
        "assistant_sys_msg = BaseMessage.make_assistant_message(\n",
        "    role_name=\"Assistant\",\n",
        "    content=\"You are a helpful assistant in helping user to generate necessary data information.\",\n",
        ")\n",
        "\n",
        "user_msg = \"\"\"Help me 1 student info in JSON format, with the following format:\n",
        "{\n",
        "    \"name\": \"string\",\n",
        "    \"age\": \"string\",\n",
        "    \"email\": \"string\"\n",
        "}\"\"\"\n",
        "\n",
        "response = qwen_agent.step(user_msg)\n",
        "print(response.msgs[0].content)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PpyFnTWlHc7j"
      },
      "source": [
        "It did it, but we need to expand our prompts, and the result still has some annoying extra texts, and we still need to parse it into valid JSON object by ourselves.\n",
        "\n",
        "A more elegant way is to use the `response_format` argument in `.step()` function:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "zu1DZw0HCfo_",
        "outputId": "0f65fc6e-b5eb-4718-9a1b-16e2f7b701fe"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "{\n",
            "  \"name\": \"John Doe\",\n",
            "  \"age\": \"20\",\n",
            "  \"email\": \"johndoe@example.com\"\n",
            "}\n"
          ]
        }
      ],
      "source": [
        "qwen_agent.reset()\n",
        "user_msg = \"Help me 1 student info in JSON format\"\n",
        "response = qwen_agent.step(user_msg, response_format=Student)\n",
        "print(response.msgs[0].content)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "52xqKSa3CaRH"
      },
      "source": [
        "And we can directly extract the Pydantic object in `response.msgs[0].parsed` field:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "VEn6YLxPC8nu",
        "outputId": "3dca012a-39ca-43e7-a5c7-f5fc15763620"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "<class '__main__.Student'>\n",
            "name='John Doe' age='20' email='johndoe@example.com'\n"
          ]
        }
      ],
      "source": [
        "print(type(response.msgs[0].parsed))\n",
        "print(response.msgs[0].parsed)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ly2R-RegHo1o"
      },
      "source": [
        "Hooray, now we successfully generate 1 entry of student, suppose we want to generate more, we can still achieve this easily."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "MuiZIKq0Hxp1",
        "outputId": "bc5235e2-6aba-4536-e83f-ae9c3f6348e4"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "{\n",
            "  \"studentList\": [\n",
            "    {\n",
            "      \"name\": \"Alice Johnson\",\n",
            "      \"age\": \"22\",\n",
            "      \"email\": \"alice.johnson@example.com\"\n",
            "    },\n",
            "    {\n",
            "      \"name\": \"Bob Smith\",\n",
            "      \"age\": \"21\",\n",
            "      \"email\": \"bob.smith@example.com\"\n",
            "    },\n",
            "    {\n",
            "      \"name\": \"Charlie Brown\",\n",
            "      \"age\": \"23\",\n",
            "      \"email\": \"charlie.brown@example.com\"\n",
            "    },\n",
            "    {\n",
            "      \"name\": \"Diana Prince\",\n",
            "      \"age\": \"24\",\n",
            "      \"email\": \"diana.prince@example.com\"\n",
            "    },\n",
            "    {\n",
            "      \"name\": \"Eve Adams\",\n",
            "      \"age\": \"25\",\n",
            "      \"email\": \"eve.adams@example.com\"\n",
            "    }\n",
            "  ]\n",
            "}\n",
            "studentList=[Student(name='Alice Johnson', age='22', email='alice.johnson@example.com'), Student(name='Bob Smith', age='21', email='bob.smith@example.com'), Student(name='Charlie Brown', age='23', email='charlie.brown@example.com'), Student(name='Diana Prince', age='24', email='diana.prince@example.com'), Student(name='Eve Adams', age='25', email='eve.adams@example.com')]\n"
          ]
        }
      ],
      "source": [
        "class StudentList(BaseModel):\n",
        "    studentList: list[Student]\n",
        "\n",
        "user_msg = \"Help me 5 random student info in JSON format\"\n",
        "response = qwen_agent.step(user_msg, response_format=StudentList)\n",
        "print(response.msgs[0].content)\n",
        "print(response.msgs[0].parsed)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "APqHogdcH1nJ"
      },
      "source": [
        "That's it! We just generate 5 random students out of nowhere by using Qwen Camel agent!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "flYNal6-R4yR"
      },
      "source": [
        "## 🌟 Highlights"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SmkXhy4JR726"
      },
      "source": [
        "This notebook has guided you through setting up and running Qwen chat agent and use it to generate structured data.\n",
        "\n",
        "Key tools utilized in this notebook include:\n",
        "\n",
        "*   **CAMEL**: A powerful multi-agent framework that enables Retrieval-Augmented Generation and multi-agent role-playing scenarios, allowing for sophisticated AI-driven tasks.\n",
        "*   **Qwen data generation**: Use Qwen model to generate structured data for further use of other applications.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "txBG2vveqlVS"
      },
      "source": [
        "That's everything: Got questions about 🐫 CAMEL-AI? Join us on [Discord](https://discord.camel-ai.org)! Whether you want to share feedback, explore the latest in multi-agent systems, get support, or connect with others on exciting projects, we’d love to have you in the community! 🤝\n",
        "\n",
        "Check out some of our other work:\n",
        "\n",
        "1. 🐫 Creating Your First CAMEL Agent [free Colab](https://docs.camel-ai.org/cookbooks/create_your_first_agent.html)\n",
        "\n",
        "2.  Graph RAG Cookbook [free Colab](https://colab.research.google.com/drive/1uZKQSuu0qW6ukkuSv9TukLB9bVaS1H0U?usp=sharing)\n",
        "\n",
        "3. 🧑‍⚖️ Create A Hackathon Judge Committee with Workforce [free Colab](https://colab.research.google.com/drive/18ajYUMfwDx3WyrjHow3EvUMpKQDcrLtr?usp=sharing)\n",
        "\n",
        "4. 🔥 3 ways to ingest data from websites with Firecrawl & CAMEL [free Colab](https://colab.research.google.com/drive/1lOmM3VmgR1hLwDKdeLGFve_75RFW0R9I?usp=sharing)\n",
        "\n",
        "5. 🦥 Agentic SFT Data Generation with CAMEL and Mistral Models, Fine-Tuned with Unsloth [free Colab](https://colab.research.google.com/drive/1lYgArBw7ARVPSpdwgKLYnp_NEXiNDOd-?usp=sharingg)\n",
        "\n",
        "Thanks from everyone at 🐫 CAMEL-AI\n",
        "\n",
        "\n",
        "<div class=\"align-center\">\n",
        "  <a href=\"https://www.camel-ai.org/\"><img src=\"https://i.postimg.cc/KzQ5rfBC/button.png\"width=\"150\"></a>\n",
        "  <a href=\"https://discord.camel-ai.org\"><img src=\"https://i.postimg.cc/L4wPdG9N/join-2.png\"  width=\"150\"></a></a>\n",
        "  \n",
        "⭐ <i>Star us on [*Github*](https://github.com/camel-ai/camel), join our [*Discord*](https://discord.camel-ai.org) or follow our [*X*](https://x.com/camelaiorg)\n",
        "</div>"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
